PostgreSQL 数据库性能 函数稳定态影响分区表分区剪枝
1 本章背景知识
PostgreSQL 函数有三种稳定态,具体可以看 PostgreSQL 数据库性能 函数稳定性影响函数调用次数
函数稳定态影响SQL 索引使用、分区剪枝。
本文通过展示稳定态对于分区剪枝功能的影响。
2 环境准备
CREATE TABLE t01_part(id INTEGER,part_date DATE,name TEXT)
PARTITION BY range(part_date)
(
PARTITION part2023 VALUES LESS THAN (to_date('20240101000000','YYYY-MM-DD HH24:MI:SS')),
PARTITION part2024 VALUES LESS THAN (to_date('20250101000000','YYYY-MM-DD HH24:MI:SS')),
PARTITION part2025 VALUES LESS THAN (to_date('20260101000000','YYYY-MM-DD HH24:MI:SS')),
PARTITION part2026 VALUES LESS THAN (to_date('20270101000000','YYYY-MM-DD HH24:MI:SS'))
);
INSERT INTO t01_part SELECT generate_series(1,2000000),now() + random()*1000 ,md5(random());
这里构建了一个以时间为分区列的表。
以下例子通过传入不同时间函数参数,验证分区剪枝情况。
3 函数稳定态与分区剪枝
先来看需要用到的两个函数的稳定态:
SELECT proname,provolatile FROM sys_proc WHERE proname in ('now','clock_timestamp');
//屏幕输出:
proname | provolatile
-----------------+-------------
now | s
clock_timestamp | v
3.1 stable 函数可以使用分区剪枝
EXPLAIN SELECT * FROM t01_part WHERE part_date=now();
//屏幕输出:
QUERY PLAN
-------------------------------------------------------------------
Gather (cost=1000.00..38819.49 rows=10251 width=44)
Workers Planned: 2
-> Parallel Append (cost=0.00..36794.39 rows=4272 width=44)
Subplans Removed: 3
-> Parallel Seq Scan on t01_part_part2023 (cost=0.00..4975.73 rows=1 width=45)
Filter: timestamp without time zone = now()
(6 行记录)
可以看到 “Subplans Removed: 3”,也就是访问的分区扣除了3个。
3.2 volatile 函数无法使用分区剪枝功能
EXPLAIN SELECT * FROM t01_part WHERE part_date=clock_timestamp();
//屏幕输出:
QUERY PLAN
-----------------------------------------------------------------------
Gather (cost=1000.00..35718.11 rows=4 width=45)
Workers Planned: 2
-> Parallel Append (cost=0.00..34717.71 rows=4 width=45)
-> Parallel Seq Scan on t01_part_part2024 (cost=0.00..12226.83 rows=1 width=45)
Filter: timestamp without time zone = clock_timestamp()
-> Parallel Seq Scan on t01_part_part2025 (cost=0.00..12153.22 rows=1 width=45)
Filter: timestamp without time zone = clock_timestamp()
-> Parallel Seq Scan on t01_part_part2026 (cost=0.00..5361.91 rows=1 width=45)
Filter: timestamp without time zone = clock_timestamp()
-> Parallel Seq Scan on t01_part_part2023 (cost=0.00..4975.73 rows=1 width=45)
Filter: timestamp without time zone = clock_timestamp()
(11 行记录)
可以看到,SQL 访问了所有数据分区。